Alphabet-dependent Parallel Algorithm for Suffix Tree Construction for Pattern Searching

نویسندگان

  • Freeson Kaniwa
  • Venu Madhav Kuthadi
  • Otlhapile Dinakenyane
  • Heiko Schroeder
چکیده

Suffix trees have recently become very successful data structures in handling large data sequences such as DNA or Protein sequences. Consequently parallel architectures have become ubiquitous. We present a novel alphabet-dependent parallel algorithm which attempts to take advantage of the perverseness of the multicore architecture. Microsatellites are important for their biological relevance hence our algorithm is based on time efficient construction for identification of such. We experimentally achieved up to 15x speedup over the sequential algorithm on different input sizes of biological sequences.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Optimal Logarithmic Time Randomized Suffix Tree Construction

The suffix tree of a string, the fundamental data structure in the area of combinatorial pattern matching, has many elegant applications. In this paper, we present a novel, simple sequential algorithm for the construction of suffix trees. We are also able to parallelize our algorithm so that we settle the main open problem in the construction of suffix trees: we give a Las Vegas CRCW PRAM algor...

متن کامل

Constructing Chromosome Scale Suffix Trees

Suffix trees have been the focus of significant research interest as they permit very efficient solutions to a range of string and sequence searching problems. Given a suffix tree that encodes a particular string, it is possible to solve problems such as searching for a specific pattern in time proportional to the length of the pattern rather than the length of the string. Suffix trees can also...

متن کامل

Constructing Genome Scale Suffix Trees

Suffix trees have been the focus of significant research interest as they permit very efficient solutions to a range of string and sequence searching problems. Given a suffix tree that encodes a particular string, it is possible to solve problems such as searching for a specific pattern in time proportional to the length of the pattern rather than the length of the string. Suffix trees can also...

متن کامل

Space-efficient K-mer Algorithm for Generalised Suffix Tree

Suffix trees have emerged to be very fast for pattern searching yielding O (m) time, where m is the pattern size. Unfortunately their high memory requirements make it impractical to work with huge amounts of data. We present a memory efficient algorithm of a generalized suffix tree which reduces the space size by a factor of 10 when the size of the pattern is known beforehand. Experiments on th...

متن کامل

Linear-Time Construction of Suffix Arrays

The time complexity of suffix tree construction has been shown to be equivalent to that of sorting: O(n) for a constant-size alphabet or an integer alphabet and O(n logn) for a general alphabet. However, previous algorithms for constructing suffix arrays have the time complexity of O(n logn) even for a constant-size alphabet. In this paper we present a linear-time algorithm to construct suffix ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1704.05660  شماره 

صفحات  -

تاریخ انتشار 2017